Visualizing Dynamics: from t-SNE to SEMI-MDPs

Authors

  • Nir Ben-Zrihem
  • Tom Zahavy
  • Shie Mannor
Abstract

Deep Reinforcement Learning (DRL) is a trending field of research, showing great promise in many challenging problems such as playing Atari, solving Go and controlling robots. While DRL agents perform well in practice, we still lack the tools to analyze their performance and visualize the temporal abstractions that they learn. In this paper, we present a novel method that automatically discovers an internal Semi-Markov Decision Process (SMDP) model in the Deep Q Network's (DQN) learned representation. We suggest a novel visualization method that represents the SMDP model as a directed graph and visualizes it on top of a t-SNE map. We show how we can interpret the agent's policy and give evidence for the hierarchical state aggregation that DQNs learn automatically. Our algorithm is fully automatic, does not require any domain-specific knowledge, and is evaluated by a novel likelihood-based evaluation criterion.
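
The abstract does not spell out how the SMDP graph is built, so the following is only a minimal sketch of the general idea of a directed graph over state clusters, not the authors' algorithm. It assumes the states of a trajectory have already been assigned to clusters (the clustering step, the `labels` input, the `smdp_edges` helper and the `min_prob` threshold are all hypothetical) and estimates edge weights as empirical transition frequencies between distinct clusters.

```python
from collections import Counter, defaultdict


def smdp_edges(labels, min_prob=0.1):
    """Estimate cluster-to-cluster transition probabilities from one trajectory.

    labels: cluster index of each consecutive state (hypothetical input; how
    the clusters are obtained is not shown here).
    Returns {(src, dst): probability} for transitions between distinct clusters.
    """
    counts = defaultdict(Counter)
    for src, dst in zip(labels, labels[1:]):
        if src != dst:  # remaining inside a cluster is not counted as a transition
            counts[src][dst] += 1

    edges = {}
    for src, outgoing in counts.items():
        total = sum(outgoing.values())
        for dst, n in outgoing.items():
            p = n / total
            if p >= min_prob:  # keep only edges frequent enough to draw
                edges[(src, dst)] = p
    return edges


# Toy trajectory of cluster labels (made-up data for illustration only).
labels = [0, 0, 1, 1, 1, 2, 0, 0, 1, 2, 2, 0, 2]
edges = smdp_edges(labels)
for (src, dst), p in sorted(edges.items()):
    print(f"{src} -> {dst}: {p:.2f}")
```

Each returned edge could then be drawn as an arrow between cluster centers on a two-dimensional embedding of the states; that plotting step is left out of this sketch.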

Similar resources

Visualizing Data using t-SNE

We present a new technique called “t-SNE” that visualizes high-dimensional data by giving each datapoint a location in a two or three-dimensional map. The technique is a variation of Stochastic Neighbor Embedding (Hinton and Roweis, 2002) that is much easier to optimize, and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map. ...

Visualizing breast cancer data with t-SNE

One in eight women will get breast cancer in her lifetime, and in 2008 it caused 458,503 deaths worldwide [15]. Although technology has improved considerably in recent decades, there is still room for further advances. A technique that may contribute to this field is t-SNE [24]. The aim of this thesis is to investigate whether t-SNE is able to present the breast ca...

Visualizing Time-Dependent Data Using Dynamic t-SNE

Many interesting processes can be represented as time-dependent datasets. We define a time-dependent dataset as a sequence of datasets captured at particular time steps. In such a sequence, each dataset is composed of observations (high-dimensional real vectors), and each observation has a corresponding observation across time steps. Dimensionality reduction provides a scalable alternative to c...

Graph Layouts by t-SNE

We propose a new graph layout method based on a modification of the t-distributed Stochastic Neighbor Embedding (t-SNE) dimensionality reduction technique. Although t-SNE is one of the best techniques for visualizing high-dimensional data as 2D scatterplots, t-SNE has not been used in the context of classical graph layout. We propose a new graph layout method, tsNET, based on representing a gra...

Supplemental Material for Visualizing Data using t-SNE

In this supplementary material, we present the results of our experiments that compare the visualizations produced by t-SNE with those produced by seven other dimensionality reduction techniques on five datasets from a variety of domains. Some of these results were already presented in the paper; however, we present the results here in a different form. The five datasets we employed in our expe...

Journal:
  • CoRR

Volume: abs/1606.07112

Publication date: 2016